Exercise 2: Muon stopping power¶

The stopping power of a particle usually refers to the energy loss rate \(dE/dx\) when it passes through matter. When charged particles travel through our LArTPC detector, they interact with the argon and lose energy.

Note

Currently this tutorial only computes dQ/dx distribution. Need refresh to make a 2D plot of dQ/dx vs residual range.

MIP and Bragg peak¶

Minimally ionizing particles (MIP) are charged particles which lose energy when passing through matter at a rate close to minimal. Particles such as muons often have energy losses close to the MIP level and are treated in practice as MIP. The only exception is when the muon comes to a stop and experiences a Bragg peak.

../_images/bragg_peak.png

Fig. 12 Example of muon Bragg peak. The muon is travelling from bottom left to top right. The color scale represents energy deposition. Red means more energy deposited. The sudden increase in deposited (lost) energy towards the end of the muon trajectory is the Bragg peak. From MicroBooNE (arxiv: 1704.02927)

I. Motivation

We know that the energy loss rate of a MIP in argon is ~2 MeV/cm. Hence our goal is to carefully isolate the MIP-like sections of muons (i.e. leaving out the ends of the track), and compute the (reconstructed) energy loss along these trajectories \(dQ/dx\). This can inform the detector calibration effort, for example, since we can compare the peak of the \(dQ/dx\) histogram with the theoretical expected value (although there are many detector effects that make this not straightforward). We can also study the spatial uniformity of the detector by looking at MIP \(dQ/dx\) values in different regions of the detector, etc. If we plot the dQ/dx change along the so-called “residual range” (i.e. distance to the end of the muon trajectory), we get a characteristic plot (due to the Bragg peak). In this tutorial we will focus on reproducing this plot.

../_images/residual_range.png

Fig. 13 Example of what we expect for a muon dQ/dx versus residual range 2d histogram. The sudden increase in deposited (lost) energy towards the end of the muon trajectory (= low residual range) is the Bragg peak. From MicroBooNE (arxiv: 2010.02390)

II. Setup

Again, we start by setting our working environment. Some necessary boilerplate code:

a. Software and data directory

import os, sys
SOFTWARE_DIR = '%s/lartpc_mlreco3d' % os.environ.get('HOME') 
DATA_DIR = os.environ.get('DATA_DIR')
# Set software directory
sys.path.append(SOFTWARE_DIR)

b. Numpy, Matplotlib, and Plotly for Visualization and data handling.

import numpy as np
import pandas as pd
import yaml

from plotly.offline import init_notebook_mode, iplot
import plotly.graph_objs as go
init_notebook_mode(connected=False)

c. MLRECO specific imports for model loading and configuration setup

The imports below load some auxiliary functions and classes required to run the full chain in interactive mode.

import warnings
warnings.filterwarnings('ignore')
from mlreco.main_funcs import process_config, prepare

Let’s load the configuration file inference.cfg to setup the full chain architecture and weights. This file uses the keyword DATA_DIR to symbolize the path to validation set and checkpoint file. We need to replace it with the actual location defined previously.

cfg = yaml.load(open('%s/inference.cfg' % DATA_DIR, 'r').read().replace('DATA_DIR', DATA_DIR),Loader=yaml.Loader)

To keep things simple we will ask the chain to stay quiet and we can load a bigger dataset than the 100-events default one:

cfg['model']['modules']['chain']['verbose'] = False
# cfg['iotool']['dataset']['data_keys'] = ['/sdf/group/neutrino/ldomine/mpvmpr_012022/test.root']

So far cfg was simply a dictionary loaded from a YAML configuration file. It needs to be consumed by a helper function process_config to be ready for the full chain usage:

process_config(cfg, verbose=False)
Config processed at: Linux tur024 3.10.0-1160.42.2.el7.x86_64 #1 SMP Tue Sep 7 14:49:57 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

$CUDA_VISIBLE_DEVICES="0"

d. Initialize and load weights to model using Trainer.

This next cell loads the dataset and the model to the notebook environment. hs stands for “handlers”. It contains various useful access points, such as the I/O iterator hs.data_io_iter to directly access the dataset or the trainer instance hs.trainer which will enable us to actually run the network.

# prepare function configures necessary "handlers"
hs = prepare(cfg)
# Optionally, you can specifiy a list of images by listing the image entry ID numbers:
# hs = prepare(cfg, event_list=[0, 1, 2])
dataset = hs.data_io_iter
Welcome to JupyROOT 6.22/09
Loading file: /sdf/home/l/ldomine/lartpc_mlreco3d_tutorials/book/data/mpvmpr_012022_test_small.root
Loading tree sparse3d_reco
Loading tree sparse3d_reco_chi2
Loading tree sparse3d_pcluster_semantics_ghost
Loading tree cluster3d_pcluster
Loading tree particle_pcluster
Loading tree particle_mpv
Loading tree sparse3d_pcluster_semantics
Loading tree sparse3d_pcluster
Loading tree particle_corrected
Warning in <TClass::Init>: no dictionary for class larcv::EventNeutrino is available
Warning in <TClass::Init>: no dictionary for class larcv::NeutrinoSet is available
Warning in <TClass::Init>: no dictionary for class larcv::Neutrino is available
Ghost Masking is enabled for UResNet Segmentation
Restoring weights for  from /sdf/home/l/ldomine/lartpc_mlreco3d_tutorials/book/data/weights_full_mpvmpr_012022.ckpt...
Done.

As usual, the model is now ready to be used (check for successful weight loading). Let’s do one forward iteration to retrieve a handful of events.

data, result = hs.trainer.forward(dataset)

III. Setup the analysis tools

The particles in a given image are classified into one of five particle types: photon \(\gamma\), electron \(e\), muon \(\mu\), pion \(\pi\), and proton \(p\). Obtaining particles from the full chain is quite simple: we initialize the FullChainEvaluator for this batch of events and examine the particle composition through the get_particles (for true particles, get_true_particles) method.

from analysis.classes.ui import FullChainEvaluator
# Only run this cell once!
predictor = FullChainEvaluator(data, result, cfg, deghosting=True)
entry = 7    # Batch ID for current sample

IV. Visualize true and predicted particles

Let’s first import plotting functions from lartpc_mlreco3d for easier visualization:

from mlreco.visualization import scatter_points, scatter_cubes, plotly_layout3d
from mlreco.visualization.plotly_layouts import white_layout, trace_particles, trace_interactions, dualplot

The evaluator will handle ghost masking internally, so all you have to do is retrieve the predicted and true semantic segmentation labels for visualization and analysis:

pred = predictor.get_predicted_label(entry, 'segment')
truth = predictor.get_true_label(entry, 'segment')
points = predictor.data_blob['input_data'][entry]
# Check if dimensions agree
assert (pred.shape[0] == truth.shape[0])

Let’s plot the voxel-level true and predicted semantic labels side-by-side:

trace1, trace2 = [], []
edep = points[:, 5]

trace1 += scatter_points(points,
                        markersize=1,
                        color=truth, 
                        cmin=0, cmax=5, colorscale='rainbow')

trace2 += scatter_points(points,
                        markersize=1,
                        color=pred, 
                        cmin=0, cmax=5, colorscale='rainbow')

fig = dualplot(trace1, trace2, titles=['True semantic labels (true no-ghost mask)', 'Predicted semantic labels (predicted no-ghost mask)' ])

iplot(fig)

By default, the label for tracks and michel electrons are 1 and 2, respectively.

track_label = 1
michel_label = 2
particles = predictor.get_particles(entry, only_primaries=False)
true_particles = predictor.get_true_particles(entry, only_primaries=False, verbose=False)
trace1 = trace_particles(particles, color='semantic_type')
trace2 = trace_particles(true_particles, color='semantic_type')
fig = dualplot(trace1, trace2, titles=['Predicted particles (predicted no-ghost mask)', 'True particles (predicted no-ghost mask)'])

iplot(fig)

The predicted particles, each color-coded according to its semantic type, will be displayed in left; the true particles on right.

Step 1: Select stopping muons

We will select track-like predicted particles that are close to a Michel predicted particle. For the stopping muon residual range study purpose, purity is more important than efficiency. Hence we only select stopping muons that decay into a Michel electron.

from scipy.spatial.distance import cdist

def get_stopping_muons(particles, Michel_threshold=10):
    selected_muons = []
    closest_points = []
    
    # Loop over predicted particles
    for p in particles:
        if p.semantic_type != track_label: continue
        coords = p.points

        # Check for presence of Michel electron
        attached_to_Michel = False
        closest_point = None
        for p2 in particles:
            if p2.semantic_type != 2: continue
            d =  cdist(p.points, p2.points)
            if d.min() >= Michel_threshold: continue
            attached_to_Michel = True
            closest_point = d.min(axis=1).argmin()

        if not attached_to_Michel: continue
        
        selected_muons.append(p)
        closest_points.append(closest_point)
        
    return selected_muons, closest_points
selected_muons, closest_points = get_stopping_muons(particles)
selected_muons
[Particle( Image ID=0   | Particle ID=11  | Semantic_type: Track           | PID: Muon     | Primary: 0  | Score = 58.05% | Interaction ID: 12 | Size: 333   )]

And just as a sanity check, let’s run the same function on the true particles to ensure it selects what we expect:

get_stopping_muons(true_particles)
([TruthParticle( Image ID=0   | Particle ID=16  | Semantic_type: Track           | PID: Muon     | Primary: 1  | Interaction ID: 3  | Size: 333   )],
 [13])

Step 2: Muon direction with PCA

Once we have a suitable muon selected, the next step is to bin it. To make the binning easier, we will find the approximate direction of the muon using principal component analysis (PCA) then project all muon voxels along this axis.

from sklearn.decomposition import PCA

def get_PCA(particles):
    pca = PCA(n_components=2)
    pca_coordinates = []
    pca_directions = []
    for p in particles:
        coords = p.points
        # PCA to get a rough direction
        coords_pca = pca.fit_transform(p.points)[:, 0]
        pca_coordinates.append(coords_pca)
        pca_directions.append(pca.components_[0, :])
    return pca_coordinates, pca_directions

Time to run our function on the muon we just selected:

pca_coordinates, pca_directions = get_PCA(selected_muons)

It is always a good idea to visualize the output of each intermediate stage to make sure the output is not garbage for the next step. The muon voxels are in blue, the computed PCA direction is in orange.

trace = []

for idx, (coords, direction) in enumerate(zip(pca_coordinates, pca_directions)):
    trace += scatter_points(selected_muons[idx].points,
                            markersize=1,
                            color='blue')
    # Artificially create a point cloud for the PCA direction visualization
    pca_points = selected_muons[idx].points[coords.argmin()][None, :] + direction[None, :] * (coords[:, None]-coords.min())
    trace += scatter_points(pca_points,
                           markersize=1,
                           color='orange')


fig = go.Figure(data=trace,layout=plotly_layout3d())
fig.layout.scene.xaxis.range=[0, 768]
fig.layout.scene.yaxis.range=[0, 768]
fig.layout.scene.zaxis.range=[0, 768]
iplot(fig)

The orange direction seems to match roughly the muon direction, so we can move on to the next step.

Step 3: Local dQ/dx

Using the rough muon direction, we will define segments by binning the projection of all voxels onto this axis.

def get_dQdx(particles, closest_points, pca_coordinates,
            bin_size=17,
            return_pca=False): # about5cm
    
    df = {
        "dQ": [],
        "dx": [],
        "residual_range": []
    }
    
    pca = PCA(n_components=2)
    
    list_coords, list_pca, list_directions = [], [], []
    for p, closest_point, coords_pca in zip(particles, closest_points, pca_coordinates):
        coords = p.points
        
        # Make sure where the end vs start is
        # if end == 0 we have the right bin ordering, otherwise might need to flip when we record the residual range
        distances_endpoints = [((coords[coords_pca.argmin(), :] - coords[closest_point, :])**2).sum(), ((coords[coords_pca.argmax(), :] - coords[closest_point, :])**2).sum()]
        end = np.argmin(distances_endpoints)

        # Split into segments and compute local dQ/dx
        bins = np.arange(coords_pca.min(), coords_pca.max(), bin_size)
        bin_inds = np.digitize(coords_pca, bins)

        for i in np.unique(bin_inds):
            mask = bin_inds == i
            if np.count_nonzero(mask) < 2: continue
            # Repeat PCA locally for better measurement of dx
            pca_axis = pca.fit_transform(p.points[mask])
            list_coords.append(p.points[mask])
            list_pca.append(pca_axis[:, 0])
            list_directions.append(pca.components_[0, :])
            
            dx = pca_axis[:, 0].max() - pca_axis[:, 0].min()
            dQ = p.depositions[mask].sum()
            residual_range = (i if end == 0 else len(bins)-i-1) * bin_size

            df["dx"].append(dx)
            df["dQ"].append(dQ)
            df["residual_range"].append(residual_range)
        
    df = pd.DataFrame(df)
    
    if return_pca:
        return list_coords, list_pca, list_directions
    
    return df     

For each segment, we compute

  • dQ (sum of reconstructed charge of all voxels in the segment)

  • dx (for better precision we recompute a local PCA direction on the segment voxels exclusively)

  • residual range (distance from the muon end, well-defined using the Michel contact point)

get_dQdx(selected_muons, closest_points, pca_coordinates)
dQ dx residual_range
0 28384.394867 16.665511 119
1 28005.154846 16.431191 102
2 33860.419662 15.849548 85
3 32149.310333 16.158118 68
4 28180.446045 15.885844 51
5 33344.725494 16.437542 34
6 28328.635315 15.012732 17
7 34864.223022 16.278467 0
8 46960.138428 16.400091 -17

Once again, let’s take the time to visualize the binning and make sure the segments make sense. The black voxels belong to the local PCA directions, the colored voxels belong to different segments (color = segment id).

import matplotlib.pyplot as plt
trace = []

list_coords, list_pca, list_directions = get_dQdx(selected_muons, closest_points, pca_coordinates, return_pca=True)
cmap = plt.cm.get_cmap('Set1')
n = len(list_coords)
for idx, (coords, pca_axis, direction) in enumerate(zip(list_coords, list_pca, list_directions)):
    trace += scatter_points(coords,
                            markersize=2,
                            color='rgb%s' % str(cmap(idx/n)[:-1]))
    #print(direction[None, :] * coords[:, None])
    #print(cmap(idx/n))
    pca_points = coords[pca_axis.argmin()][None, :] + direction[None, :] * (pca_axis[:, None]-pca_axis.min())
    trace += scatter_points(pca_points,
                           markersize=2,
                           color='black')


fig = go.Figure(data=trace,layout=plotly_layout3d())
fig.layout.scene.xaxis.range=[0, 768]
fig.layout.scene.yaxis.range=[0, 768]
fig.layout.scene.zaxis.range=[0, 768]
iplot(fig)

Step 4: Repeat with high statistics

We have computed everything we want, we just need to repeat this with higher statistics.

muons = pd.DataFrame({
        "index": [],
        "dQ": [],
        "dx": [],
        "residual_range": []
    })

from tqdm import tqdm
for iteration in tqdm(range(10)):
    data, result = hs.trainer.forward(dataset)
    predictor = FullChainEvaluator(data, result, cfg, deghosting=True, processor_cfg={'michel_primary_ionization_only': True})
    for entry, index in enumerate(predictor.index):  
        particles = predictor.get_particles(entry, only_primaries=False)
        selected_muons, closest_points = get_stopping_muons(particles)
        pca_coordinates, pca_directions = get_PCA(selected_muons)
        df = get_dQdx(selected_muons, closest_points, pca_coordinates)
        df['index'] = index
        muons = pd.concat([muons, df])
  0%|                                                                                                                                | 0/10 [00:00<?, ?it/s]
 10%|████████████                                                                                                            | 1/10 [00:04<00:39,  4.40s/it]
 20%|████████████████████████                                                                                                | 2/10 [00:09<00:36,  4.60s/it]
 30%|████████████████████████████████████                                                                                    | 3/10 [00:13<00:30,  4.33s/it]
 40%|████████████████████████████████████████████████                                                                        | 4/10 [00:16<00:24,  4.06s/it]
 50%|████████████████████████████████████████████████████████████                                                            | 5/10 [00:20<00:19,  3.97s/it]
 60%|████████████████████████████████████████████████████████████████████████                                                | 6/10 [00:24<00:15,  3.99s/it]
 70%|████████████████████████████████████████████████████████████████████████████████████                                    | 7/10 [00:27<00:11,  3.77s/it]
 80%|████████████████████████████████████████████████████████████████████████████████████████████████                        | 8/10 [00:31<00:07,  3.70s/it]
 90%|████████████████████████████████████████████████████████████████████████████████████████████████████████████            | 9/10 [00:35<00:03,  3.70s/it]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:35<00:00,  2.62s/it]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:35<00:00,  3.54s/it]

muons
index dQ dx residual_range
0 12.0 21799.233643 16.536523 85.0
1 12.0 26032.700577 16.001325 68.0
2 12.0 28997.909004 16.740050 51.0
3 12.0 28033.805969 15.666256 34.0
4 12.0 28577.810974 15.638082 17.0
... ... ... ... ...
9 97.0 35972.538376 16.761004 51.0
10 97.0 33305.298462 16.661945 34.0
11 97.0 36291.415375 16.379231 17.0
12 97.0 42445.251282 17.685512 0.0
13 97.0 7054.667847 1.414214 -17.0

848 rows × 4 columns

Plot dQ/dx vs residual range

import matplotlib
import matplotlib.pyplot as plt
import seaborn
seaborn.set(rc={
    'figure.figsize':(15, 10),
})
seaborn.set_context('talk')
clean = (muons["dx"]*0.3 > 3.5) & (muons["dx"]*0.3 < 50) #& (cells['pca_length']*0.3 > 150)
#clean = np.ones(shape=muons["residual_range"].shape, dtype=np.bool)
plt.hist2d(muons["residual_range"][clean]*0.3, muons["dQ"][clean]/muons["dx"][clean],
          range=[[0, 200], [500, 4000]], bins=[40, 80],
          cmap="viridis")
plt.colorbar()
plt.xlabel("Residual range (cm)")
plt.ylabel("Muon dQ/dx [arbitrary units / cm]")
Text(0, 0.5, 'Muon dQ/dx [arbitrary units / cm]')
../_images/dEdx_54_1.png